Correlating neural and symbolic representations of language
Analysis methods which enable us to better understand the representations and
functioning of neural models of language are increasingly needed as deep
learning becomes the dominant approach in NLP. Here we present two methods
based on Representational Similarity Analysis (RSA) and Tree Kernels (TK) which
allow us to directly quantify how strongly the information encoded in neural
activation patterns corresponds to information represented by symbolic
structures such as syntax trees. We first validate our methods on the case of a
simple synthetic language for arithmetic expressions with clearly defined
syntax and semantics, and show that they exhibit the expected pattern of
results. We then apply our methods to correlate neural representations of
English sentences with their constituency parse trees.
Comment: ACL 201
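The general idea of RSA described in this abstract can be sketched as follows: compute pairwise similarities among neural vectors, compute pairwise similarities among the corresponding symbolic structures, and correlate the two sets of similarities. This is only a minimal sketch under stated assumptions. The nested-tuple tree encoding, the function names, the use of cosine similarity and Pearson correlation, and in particular `tree_similarity` (a simple overlap of production rules standing in for a real tree kernel) are illustrative choices, not the authors' implementation.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def productions(tree):
    """Collect (parent, child-labels) rules from a nested-tuple tree.

    Hypothetical encoding: a tree is (label, child, ...) and a leaf is a str.
    """
    if isinstance(tree, str):
        return set()
    label, *children = tree
    rule = (label, tuple(c if isinstance(c, str) else c[0] for c in children))
    rules = {rule}
    for child in children:
        rules |= productions(child)
    return rules


def tree_similarity(t1, t2):
    """Stand-in for a tree kernel: Jaccard overlap of production rules."""
    p1, p2 = productions(t1), productions(t2)
    return len(p1 & p2) / len(p1 | p2)


def rsa_score(items_a, sim_a, items_b, sim_b):
    """Correlate pairwise similarities in space A with those in space B."""
    pairs = list(combinations(range(len(items_a)), 2))
    a = [sim_a(items_a[i], items_a[j]) for i, j in pairs]
    b = [sim_b(items_b[i], items_b[j]) for i, j in pairs]
    return pearson(a, b)
```

With a list of activation vectors `vecs` and the matching list of parse trees `trees`, a call such as `rsa_score(vecs, cosine, trees, tree_similarity)` gives a single correlation value; a higher value would indicate that sentences with similar trees also have similar activations.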
Learning to Understand Child-directed and Adult-directed Speech
Speech directed to children differs from adult-directed speech in linguistic
aspects such as repetition, word choice, and sentence length, as well as in
aspects of the speech signal itself, such as prosodic and phonemic variation.
Human language acquisition research indicates that child-directed speech helps
language learners. This study explores the effect of child-directed speech when
learning to extract semantic information from speech directly. We compare the
task performance of models trained on adult-directed speech (ADS) and
child-directed speech (CDS). We find indications that CDS helps in the initial
stages of learning, but eventually, models trained on ADS reach comparable task
performance, and generalize better. The results suggest that this is at least
partially due to linguistic rather than acoustic properties of the two
registers, as we see the same pattern when looking at models trained on
acoustically comparable synthetic speech.
Comment: Authors found an error in preprocessing of transcriptions before they
were fed to SBERT. After correction, the experiments were rerun. The updated
results can be found in this version. Importantly, most scores were
affected to a small degree (performance was slightly worse), and the effect was
consistent across conditions. Therefore, the general patterns remain the sam
Learning language through pictures
We propose Imaginet, a model of learning visually grounded representations of
language from coupled textual and visual input. The model consists of two Gated
Recurrent Unit networks with shared word embeddings, and uses a multi-task
objective by receiving a textual description of a scene and trying to
concurrently predict its visual representation and the next word in the
sentence. Mimicking an important aspect of human language learning, it acquires
meaning representations for individual words from descriptions of visual
scenes. Moreover, it learns to effectively use sequential structure in semantic
interpretation of multi-word phrases.
Comment: To appear at ACL 201
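A multi-task objective of the kind described in this abstract combines a loss for predicting the visual representation with a loss for predicting the next word. The sketch below is illustrative only: the abstract does not specify the loss functions or how the two tasks are weighted, so the mean squared error, the cross-entropy, the mixing weight `alpha`, and all function names are assumptions.

```python
import math


def mean_squared_error(pred, target):
    """Average squared difference between predicted and target vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct next word."""
    return -math.log(probs[target_index])


def multitask_loss(pred_image, image, next_word_probs, next_word, alpha=0.5):
    """Weighted sum of the visual and textual losses.

    alpha is a hypothetical mixing weight: 1.0 uses only the visual task,
    0.0 uses only next-word prediction.
    """
    visual = mean_squared_error(pred_image, image)
    textual = cross_entropy(next_word_probs, next_word)
    return alpha * visual + (1 - alpha) * textual
```

In a real model both predictions would come from recurrent networks that share word embeddings, so gradients from either task update the same embedding table.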
Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.
Analyzing analytical methods: The case of phonology in neural models of spoken language
Given the fast development of analysis techniques for NLP and speech
processing systems, few systematic studies have been conducted to compare the
strengths and weaknesses of each method. As a step in this direction we study
the case of representations of phonology in neural network models of spoken
language. We use two commonly applied analytical techniques, diagnostic
classifiers and representational similarity analysis, to quantify to what
extent neural activation patterns encode phonemes and phoneme sequences. We
manipulate two factors that can affect the outcome of analysis. First, we
investigate the role of learning by comparing neural activations extracted from
trained versus randomly-initialized models. Second, we examine the temporal
scope of the activations by probing both local activations corresponding to a
few milliseconds of the speech signal, and global activations pooled over the
whole utterance. We conclude that reporting analysis results with randomly
initialized models is crucial, and that global-scope methods tend to yield more
consistent results; we recommend their use as a complement to local-scope
diagnostic methods.
Comment: ACL 202
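The distinction between local and global activations in this abstract can be illustrated with a small sketch. Local analysis would probe each frame-level activation vector separately, while global analysis first pools all frames of an utterance into one vector. The abstract does not say which pooling operation is used, so the mean pooling and the function name below are assumptions.

```python
def global_activation(frames):
    """Mean-pool frame-level activation vectors into one utterance vector.

    frames is a list of equal-length vectors, one per time step.
    Mean pooling is an assumed choice for illustration.
    """
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


# Local scope: each frame is analysed on its own.
frames = [[1.0, 2.0], [3.0, 4.0]]
# Global scope: one pooled vector represents the whole utterance.
utterance_vector = global_activation(frames)
```

The same pooling would be applied to activations from both the trained model and a randomly initialized copy, so that the two can be compared under identical conditions.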
Encoding of phonology in a recurrent neural model of grounded speech
We study the representation and encoding of phonemes in a recurrent neural
network model of grounded speech. We use a model which processes images and
their spoken descriptions, and projects the visual and auditory representations
into the same semantic space. We perform a number of analyses on how
information about individual phonemes is encoded in the MFCC features extracted
from the speech signal, and the activations of the layers of the model. Via
experiments with phoneme decoding and phoneme discrimination we show that
phoneme representations are most salient in the lower layers of the model,
where low-level signals are processed at a fine-grained level, although a large
amount of phonological information is retained at the top recurrent layer. We
further find that the attention mechanism following the top recurrent layer
significantly attenuates encoding of phonology and makes the utterance
embeddings much more invariant to synonymy. Moreover, a hierarchical clustering
of phoneme representations learned by the network shows an organizational
structure of phonemes similar to those proposed in linguistics.
Comment: Accepted at CoNLL 201
Revisiting the Hierarchical Multiscale LSTM
Hierarchical Multiscale LSTM (Chung et al., 2016a) is a state-of-the-art
language model that learns interpretable structure from character-level input.
Such models can provide fertile ground for (cognitive) computational
linguistics studies. However, the high complexity of the architecture, training
procedure and implementations might hinder its applicability. We provide a
detailed reproduction and ablation study of the architecture, shedding light on
some of the potential caveats of re-purposing complex deep-learning
architectures. We further show that simplifying certain aspects of the
architecture can in fact improve its performance. We also investigate the
linguistic units (segments) learned by various levels of the model, and argue
that their quality does not correlate with the overall performance of the model
on language modeling.
Comment: To appear in COLING 2018 (reproduction track)
Quantifying cross-linguistic influence with a computational model: A study of case-marking comprehension
Cross-linguistic influence (CLI) is one of the key phenomena in bilingual and second language learning. We propose a method for quantifying CLI in the use of linguistic constructions with the help of a computational model, which acquires constructions in two languages from bilingual input. We focus on the acquisition of case-marking cues in Russian and German and simulate two experiments that employ a picture-choice task tapping into the mechanisms of sentence interpretation. Our model yields behavioral patterns similar to those of humans, and these patterns can be explained by the amount of CLI: high amounts of negative CLI lead to the misinterpretation of participant roles in Russian and German object-verb-subject sentences. Finally, we make two novel predictions about the acquisition of case-marking cues in Russian and German. Most importantly, our simulations suggest that a high degree of positive CLI may facilitate the interpretation of object-verb-subject sentences.